Biostatistics For Dummies (Monika Wahi John Pezzullo)

As mentioned earlier, the purpose of taking measurements from a sample of a population is so that you
can use them to perform inferential statistics, which enables you to make estimates about the population
without having to measure the entire population. Ideally, you want the statistics from your
sample to be as close as possible to the population parameters you are trying to estimate. To increase
the likelihood that this happens, you should try your best to draw a sample that is representative of the
population.

You may be wondering, “What is the best way to draw a sample that is representative of the
background population?” The honest answer is, “It depends on your resources.” If you are a
government agency, you can invest a lot of resources in conducting representative sampling from a
population for your studies. But if you are a graduate student working on a dissertation, then given the
resources available, you probably have to settle for a sample that is less representative of the
population than one a government agency could afford to draw. Nevertheless, you can still use your
judgment to make the wisest decisions possible about your sampling approach.

Taking a simple random sample

Taking a simple random sample (SRS) is considered a representative approach to sampling from a
background population. In an SRS, every member of the population has an equal chance of being
selected randomly and included in the sample. As an example, recall the printout of the current patient
list from a clinic discussed in the previous section. Considering that list a clinical population, imagine
that you used scissors to cut the list up so that each name was on its own slip of paper, and then you put
all the slips of paper into a hat. If you want to take an SRS of 20 patients, you could randomly remove
20 names from the hat. Because every name had the same chance of being drawn, the resulting SRS
would be seen as a highly representative sample.
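To make "equal chance" concrete, here is a short worked formula. It assumes the names are drawn from the hat without replacement, and it uses the symbols N (the number of patients on the list) and n (the sample size), which are introduced here for illustration only:

```latex
P(\text{a given patient is selected}) = \frac{n}{N},
\qquad
P(\text{a given set of } n \text{ patients is drawn}) = \frac{1}{\binom{N}{n}}
```

For instance, if the list held a hypothetical 200 patients and you drew 20 names, each patient would have a 20/200 = 0.10 chance of ending up in the sample.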

In practice, an SRS is usually taken using a computer so that you can take advantage of a
random number generator (RNG) and do not have to cut up all that paper. Imagine that the
patient list from which you were sampling was not printed on paper, but was instead stored in a
column in a spreadsheet in Microsoft Excel. You could use the following steps to take an SRS of
20 patients from this list using the computer:

1. Create a column containing random numbers.

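You could create another column in the spreadsheet called “Random” and enter the following
formula into the top cell in the column: =RAND(). If you drag that cell down so that the entire
column contains this formula, you will see that Excel populates each cell with a random number
between 0 and 1. Each time Excel recalculates the worksheet, the random numbers are regenerated,
so the values you see after sorting will differ from the ones the sort used. If you want to keep a
record of them, copy the column and paste it back as values before you sort.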

2. Sort the list by the random number column.

3. Select the top 20 rows from the list.

This process ensures that your sample of 20 patients was taken completely at random. Statistical
packages like those described in Chapter 4 have RNG commands similar to the one in Excel.
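As a rough illustration outside of Excel, the three spreadsheet steps above can be sketched in Python. This is only a sketch under stated assumptions: the patient list is made up, the function name simple_random_sample and the seed value are hypothetical, and Python's standard random module stands in for Excel's =RAND().

```python
import random


def simple_random_sample(patients, n, seed=None):
    """Mimic the spreadsheet steps: random column, sort, keep the top n rows."""
    if n > len(patients):
        raise ValueError("sample size cannot exceed the size of the list")
    rng = random.Random(seed)  # a seed makes the draw reproducible
    # Step 1: pair each patient with a random number in [0, 1), like =RAND()
    keyed = [(rng.random(), name) for name in patients]
    # Step 2: sort the list by the random number
    keyed.sort(key=lambda pair: pair[0])
    # Step 3: select the top n rows
    return [name for _, name in keyed[:n]]


# Hypothetical patient list standing in for the clinic printout
patient_list = [f"Patient {i:03d}" for i in range(1, 201)]
sample = simple_random_sample(patient_list, 20, seed=42)
print(sample)
```

Python can also do the whole thing in one call with random.sample(patient_list, 20), which draws 20 distinct names without replacement. The longer version is shown only because it follows the spreadsheet steps one for one.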